exp/imp vs. expdp/impdp: a comparison and some optimization notes


歪果人数码 2022-04-16

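I. Comparing exp/imp with expdp/impdp

1.1 expdp/impdp call a server-side API to do their work; the operation runs as a job inside the database. The client can be started remotely, but the dump files are always written to a directory object on the server.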

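1.2 exp/imp and expdp/impdp differ in their default modes and in how they work

1.2.1 How the exp/imp modes work

The following MetaLink note describes how exp/imp work in their different modes:

Parameter DIRECT: Conventional Path Export Versus Direct Path Export [ID 155477.1]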

Starting with Oracle7 release 7.3, the Export utility provides two methods for exporting table data:

- Conventional Path Export

- Direct Path Export

(1) Conventional path Export.

Conventional path Export uses the SQL SELECT statement to extract data from tables. Data is read from disk into the buffer cache, and rows are transferred to the evaluating buffer. The data, after passing expression evaluation, is transferred to the Export client, which then writes the data into the export file.

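exp uses the conventional path by default. In this mode a SELECT statement extracts the data: blocks are read from disk into the buffer cache, the rows are passed to the evaluating buffer, and the result is sent to the Export client, which writes it to the dump file.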

(2) Direct path Export.

When using a Direct path Export, the data is read from disk directly into the export session's program global area (PGA): the rows are transferred directly to the Export session's private buffer. This also means that the SQL command-processing layer (evaluation buffer) can be bypassed, because the data is already in the format that Export expects. As a result, unnecessary data conversion is avoided. The data is transferred to the Export client, which then writes the data into the export file.

The default is DIRECT=N, which extracts the table data using the conventional path.

This parameter is only applicable to the original export client. Export Data Pump (expdp) uses a Direct Path unload by default and switches to External Table mode if required.

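In direct path mode the data is read from disk straight into the PGA. It is already in the format Export expects, so no conversion is needed, and it is passed directly to the Export client, which writes it to the dump file. Because the evaluation buffer is skipped, one processing step is saved and the export is noticeably faster.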

1.2.2 The expdp/impdp modes

Export/Import Data Pump Parameter ACCESS_METHOD - How to Enforce a Method of Loading and Unloading Data? [ID 552424.1]

The two most commonly used methods to move data in and out of databases with Data Pump are the "Direct Path" method and the "External Tables" method.

(1) Direct Path mode.

After data file copying, direct path is the fastest method of moving data. In this method, the SQL layer of the database is bypassed and rows are moved to and from the dump file with only minimal interpretation. Data Pump automatically uses the direct path method for loading and unloading data when the structure of a table allows it.

(2) External Tables mode.

If data cannot be moved in direct path mode, or if there is a situation where parallel SQL can be used to speed up the data move even more, then the external tables mode is used. The external table mechanism creates an external table that maps the dump file data for the database table. The SQL engine is then used to move the data. If possible, the APPEND hint is used on import to speed the copying of the data into the database.

Note: When the Export NETWORK_LINK parameter is used to specify a network link for an export operation, a variant of the external tables method is used. In this case, data is selected from across the specified network link and inserted into the dump file using an external table.

(3) Data File Copying mode.

This mode is used when a transport tablespace job is started, i.e.: the TRANSPORT_TABLESPACES parameter is specified for an Export Data Pump job. This is the fastest method of moving data because the data is not interpreted nor altered during the job, and Export Data Pump is used to unload only structural information (metadata) into the dump file.

(4) Network Link Import mode.

This mode is used when the NETWORK_LINK parameter is specified during an Import Data Pump job. This is the slowest of the four access methods because this method makes use of an INSERT SELECT statement to move the data over a database link, and reading over a network is generally slower than reading from a disk.

This mode is convenient, but it is the slowest of the four, because it is implemented as INSERT ... SELECT over a database link.

Creating the database link:

/* Formatted on 2010/12/23 11:28:22 (QP5 v5.115.810.9015) */

CREATE DATABASE LINK TIANLESOFTWARE
   CONNECT TO BUSINESS
   IDENTIFIED BY <password>   -- the password is omitted in the source
   USING
      '(DESCRIPTION=
         (ADDRESS_LIST=
           (ADDRESS=(PROTOCOL=TCP)(HOST=IPADDRESS)(PORT=1521))
         )
         (CONNECT_DATA=
           (SID=ORCL)
           (SERVER=DEDICATED)
         )
       )';

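The DUMPFILE parameter accepts the %U substitution variable:

expdp xxx/xxx schemas=xxx directory=dump1 dumpfile=xxx_%U.dmp filesize=5g

Each file is then limited to 5 GB, and the files are named xxx_01.dmp, xxx_02.dmp, and so on.

expdp xxx/xxx schemas=xxx directory=dump1 network_link=dbl_65 dumpfile=xxx_01.dump,xxx_02.dump

Listing the files explicitly also works, but without FILESIZE a dump file has no size limit, so there is no fixed size at which Data Pump moves on from xxx_01.dump to xxx_02.dump.

ESTIMATE_ONLY=y estimates the size of the export without writing any dump file.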
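As a rough sketch of how the estimate and FILESIZE fit together: once the estimated size is known, the number of files a %U template will need is roughly the estimate divided by FILESIZE, rounded up. The function name dump_file_count and the 23 GB figure are illustrative assumptions, and the estimate covers table data only, so the real file count can differ.

```shell
#!/bin/sh
# Estimate how many dump files a %U template will produce.
# Arguments: estimated size and FILESIZE, both in the same unit (e.g. GB).
dump_file_count() {
    est=$1
    filesize=$2
    # Integer ceiling division: (est + filesize - 1) / filesize
    echo $(( (est + filesize - 1) / filesize ))
}

# Step 1 (only printed here, not run): ask Data Pump for an estimate.
# ESTIMATE_ONLY=y cannot be combined with DUMPFILE.
echo "expdp xxx/xxx schemas=xxx directory=dump1 estimate_only=y"

# Step 2: with a hypothetical 23 GB estimate and filesize=5g.
dump_file_count 23 5
```

With these sample numbers the function prints 5, that is xxx_01.dmp through xxx_05.dmp.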

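NETWORK_LINK: with this parameter expdp does not have to run on the source machine. It can run on the target machine and pull the data from the source database over the database link into dump files on the target:

expdp xxx/xxx schemas=xxx directory=dump1 network_link=foo dumpfile=xxx_%U.dump filesize=10m

Alternatively, impdp with NETWORK_LINK imports directly, with no dump file at all.

Note that LOB columns can be moved over a NETWORK_LINK, but LONG columns raise an error:

ORA-31679: Table data object "xx"."SYS_USER" has long columns, and longs can not be loaded/unloaded using a network link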
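The file-less import mentioned above could look like the following sketch. It only prints the command rather than running it. The helper name netimp_cmd and the log file name are assumptions; the schema, link and directory names are the placeholders used in the earlier examples.

```shell
#!/bin/sh
# Print (not run) an impdp command that imports a schema straight over a
# database link. No DUMPFILE is given; DIRECTORY is only used for the log file.
netimp_cmd() {
    schema=$1
    dblink=$2
    directory=$3
    printf 'impdp xxx/xxx schemas=%s network_link=%s directory=%s logfile=netimp_%s.log\n' \
        "$schema" "$dblink" "$directory" "$schema"
}

netimp_cmd xxx dbl_65 dump1
```

The command is run on the target database server, and the database link must point at the source database.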

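1.3 Network and disk impact

expdp/impdp run on the server, so their speed is limited mainly by disk I/O (the network only matters when NETWORK_LINK is used).

exp/imp can run on the server or on a client machine, so they are limited by both the network and the disk.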

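1.4 Functional differences between exp/imp and expdp/impdp

(1) Importing user usera's objects into user userb: imp uses fromuser=usera touser=userb, while impdp uses remap_schema='usera':'userb'. For example:

imp system/passwd fromuser=usera touser=userb file=/oracle/exp.dmp log=/oracle/exp.log

impdp system/passwd directory=expdp dumpfile=expdp.dmp remap_schema='usera':'userb' logfile=exp.log

The impdp log file is written to the directory object; LOGFILE does not accept an operating system path.

(2) Changing the tablespace: with exp/imp, moving a table to a different tablespace takes manual work, for example alter table xxx move tablespace tablespace_new. With impdp, remap_tablespace='tablespace_old':'tablespace_new' is enough.

(3) Listing tables: exp/imp use tables=('table1','table2','table3'); expdp/impdp use tables='table1','table2','table3'.

(4) Whether to export rows

exp: ROWS=Y exports the rows, ROWS=N does not.

expdp: CONTENT (ALL: object definitions plus rows; DATA_ONLY: rows only; METADATA_ONLY: object definitions only).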
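The impdp options from items (1), (2) and (4) can be collected in a parameter file, which also avoids shell quoting problems around the colon-separated remap values. The sketch below writes such a file and prints the command that would use it. The function name, the file paths and the log file name are assumptions for illustration.

```shell
#!/bin/sh
# Write an impdp parameter file that remaps schema and tablespace and
# imports object definitions only (no rows).
write_remap_parfile() {
    cat > "$1" <<'EOF'
directory=expdp
dumpfile=expdp.dmp
logfile=imp_remap.log
remap_schema=usera:userb
remap_tablespace=tablespace_old:tablespace_new
content=metadata_only
EOF
}

write_remap_parfile /tmp/imp_remap.par
cat /tmp/imp_remap.par
# The import itself is only printed here, not run:
echo "impdp system/passwd parfile=/tmp/imp_remap.par"
```

Changing content=metadata_only to content=all would import the rows as well.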

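II. Optimization notes

2.1 EXP

As the analysis above shows, direct path speeds up an export, so it is worth using with exp. Two parameters are involved: DIRECT and RECORDLENGTH.

DIRECT selects direct path (DIRECT=Y) or conventional path (DIRECT=N). Conventional path export extracts data with SQL SELECT statements; direct path export reads the data from disk into the PGA and writes it to the export file as is, which avoids the data conversion in the SQL command-processing layer and makes the export much more efficient. The advantage grows with data volume, and direct path can be up to about three times faster than the conventional method.

RECORDLENGTH is used together with DIRECT=Y. It sets the size of the Export I/O buffer and plays a role similar to BUFFER in a conventional path export. The recommended setting is the maximum I/O buffer, 65535 (64 KB). For example:

exp userid=system/manager full=y direct=y recordlength=65535 file=exp_full.dmp log=exp_full.log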

Some restrictions:

You cannot use the DIRECT=Y parameter when exporting in tablespace-mode (i.e. when specifying the parameter TRANSPORT_TABLESPACES=Y). You can use the DIRECT=Y parameter when exporting in full, user or table mode (i.e.: when specifying FULL=Y or OWNER=scott or TABLES=scott.emp).

-- Direct path cannot be used in tablespace mode.

The parameter QUERY applies ONLY to conventional path Export. It cannot be specified in a direct path export (DIRECT=Y).

-- Direct path does not support the QUERY parameter; QUERY works only in conventional path mode.

In versions of Export prior to 8.1.5, you could not use direct path Export for tables containing objects and LOBs.

-- With an exp version older than 8.1.5, direct path export cannot be used for tables with object or LOB columns. Version 8 databases are rare today, so this restriction can be ignored.

The BUFFER parameter applies ONLY to conventional path Export. It has no effect on a direct path Export. This BUFFER parameter specifies the size (in bytes) of the buffer used to fetch rows. It determines the maximum number of rows in an array, fetched by Export. For direct path Export, use the RECORDLENGTH parameter to specify the size of the buffer that Export uses for writing to the export file.

-- BUFFER only affects conventional path exp and has no effect on direct path. For direct path, set RECORDLENGTH instead.

The RECORDLENGTH parameter specifies the length (in bytes) of the file record. You can use this parameter to specify the size of the Export I/O buffer (highest value is 64 kb). Changing the RECORDLENGTH parameter affects only the size of data that accumulates before writing to disk. It does not affect the operating system file block size. If you do not define this parameter, it defaults to your platform-dependent value for BUFSIZ (1024 bytes in most cases).

Invoking a Direct path Export with a maximum I/O buffer of 64kb can improve the performance of the Export with almost 50%. This can be achieved by specifying the additional Export parameters DIRECT and RECORDLENGTH.

-- For direct path, set RECORDLENGTH to 64 KB (65535); it gives a sizeable performance gain.
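The QUERY and BUFFER/RECORDLENGTH restrictions above can be folded into a small wrapper that picks the path options. This is only a sketch: the function name exp_path_opts is made up, the BUFFER value is an arbitrary illustration, and the exp command is printed rather than executed.

```shell
#!/bin/sh
# Choose exp path options: direct path with the largest record length, unless
# a QUERY clause is requested, which only works on the conventional path.
exp_path_opts() {
    query=$1            # empty string means "no QUERY filter"
    if [ -n "$query" ]; then
        # Conventional path: BUFFER applies, DIRECT must stay N.
        echo "direct=n buffer=10240000"
    else
        # Direct path: RECORDLENGTH applies, BUFFER has no effect.
        echo "direct=y recordlength=65535"
    fi
}

# No QUERY filter, so the direct path options are chosen.
echo "exp system/manager owner=scott file=scott.dmp log=scott.log $(exp_path_opts '')"
```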

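2.2 IMP

An Oracle Import takes several times longer than the corresponding Export. Sometimes an import is part of recovering from an emergency database failure, and speeding it up is essential to reduce downtime. There is no silver bullet for a large import, but a few settings can shorten the total time.

(1) Avoid I/O contention

Import is an I/O-intensive operation, so avoiding I/O contention speeds it up. If possible, do not import during peak hours, and do not run jobs or other resource-hungry operations while the import is running.

(2) Increase the sort area

Import loads the data first and then creates the indexes. Whether INDEXES is set to Y or N, the primary key indexes are always created. Index creation needs sort space; when memory is insufficient the sort spills to the temporary tablespace, and a disk sort is several orders of magnitude slower than an in-memory sort. A larger sort area therefore makes index creation much faster and shortens the import.

(3) Tune the BUFFER option

The imp BUFFER parameter sets how much data is read from the export file in each pass. The larger it is, the fewer reads Import needs, which improves efficiency. The right size depends on the application and the size of the database; in general about a hundred megabytes is plenty (the example below uses roughly 10 MB). For example:

imp user/pwd fromuser=user1 touser=user2 file=/tmp/imp_db_pipe1 commit=y feedback=10000 buffer=10240000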

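(4) Use COMMIT=Y

COMMIT=Y commits each time the data buffer fills, instead of once per table. This greatly reduces the consumption of rollback segments and similar resources, which helps the import complete successfully.

(5) Use INDEXES=N

As mentioned under the sort area, imp loads the data first and then creates the indexes. Building user-defined indexes during the import takes a lot of time, especially when a table has several indexes or is very large. When the data must be loaded as fast as possible and the indexes can be built later, INDEXES=N loads the data without creating the indexes.

The INDEXFILE option generates a DDL script with the index creation statements, which can then be run by hand. Another way is to import twice, the first time for the data and the second time for the indexes:

imp user/pwd fromuser=user1 touser=user2 file=/tmp/imp_db_pipe1 commit=y feedback=10000 buffer=10240000 ignore=y rows=y indexes=n

imp user/pwd fromuser=user1 touser=user2 file=/tmp/imp_index_pipe1 commit=y feedback=10000 buffer=10240000 ignore=y rows=n indexes=y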
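The INDEXFILE variant is described above but not shown. A hedged sketch of the sequence follows; it prints the commands instead of running them. The script name create_idx.sql, the helper name indexes_later_plan and the user2/pwd login are assumptions.

```shell
#!/bin/sh
# Print the three steps of an "indexes later" import (nothing is executed).
indexes_later_plan() {
    dump=$1
    # 1. Extract the index DDL into a script; with INDEXFILE no data is imported.
    echo "imp user/pwd fromuser=user1 touser=user2 file=$dump indexfile=create_idx.sql"
    # 2. Load the rows without building the user-defined indexes.
    echo "imp user/pwd fromuser=user1 touser=user2 file=$dump commit=y buffer=10240000 ignore=y rows=y indexes=n"
    # 3. Build the indexes afterwards from the generated script.
    echo "sqlplus user2/pwd @create_idx.sql"
}

indexes_later_plan /tmp/imp_db_pipe1
```

It is worth reviewing the generated script before step 3, for example to adjust tablespace or storage clauses.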

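(6) Increase LARGE_POOL_SIZE

If init.ora sets parameters such as MTS_SERVICE and MTS_DISPATCHERS, and tnsnames.ora does not specify (SERVER=DEDICATED), the database is running in shared server (MTS) mode. In MTS mode exp/imp use the large pool, and raising LARGE_POOL_SIZE to 150M is recommended.

To check whether the database is in MTS mode:

SQL> select distinct server from v$session;

If the result contains NONE or SHARED, MTS is enabled.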
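If this check is scripted, the rule above can be applied to the query output with a small filter. The sketch assumes the SERVER values arrive one per line on standard input, for example from a silent sqlplus call with headings turned off. The function name and the output messages are made up.

```shell
#!/bin/sh
# Read SERVER values (one per line) on stdin and report whether any session
# shows SHARED or NONE, which the rule above treats as shared server in use.
uses_shared_server() {
    if grep -Eiq '^[[:space:]]*(shared|none)[[:space:]]*$'; then
        echo "shared server in use"
    else
        echo "dedicated only"
    fi
}

# Sample input standing in for the query result.
printf 'DEDICATED\nNONE\n' | uses_shared_server
```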

2.3 Expdp/Impdp

Data Pump is much faster than exp/imp, and the biggest factor is PARALLEL. Roughly speaking: expdp/impdp = exp/imp + direct mode + parallel. So to speed up Data Pump, set the PARALLEL parameter.

First, two related settings:

Setting Parallelism

For export and import operations, the parallelism setting (specified with the PARALLEL parameter) should be less than or equal to the number of dump files in the dump file set. If there are not enough dump files, the performance will not be optimal because multiple threads of execution will be trying to access the same dump file.

The PARALLEL parameter is valid only in the Enterprise Edition of Oracle Database 10g.

Using Substitution Variables

Instead of, or in addition to, listing specific filenames, you can use the DUMPFILE parameter during export operations to specify multiple dump files, by using a substitution variable (%U) in the filename. This is called a dump file template. The new dump files are created as they are needed, beginning with 01 for %U, then using 02, 03, and so on. Enough dump files are created to allow all processes specified by the current setting of the PARALLEL parameter to be active. If one of the dump files becomes full because its size has reached the maximum size specified by the FILESIZE parameter, it is closed, and a new dump file (with a new generated name) is created to take its place.

Suppose we run the following:

expdp full=y directory=dump dumpfile=orcl_%U.dmp parallel=4

The number of dump files produced depends on PARALLEL, and the same relationship matters on import. PARALLEL should be less than or equal to the number of dump files. If it is larger, the extra worker processes have no file of their own to work on and add nothing to performance.
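On the import side, the rule can be applied mechanically: count the files in the dump set and cap PARALLEL at that number. A minimal sketch follows. The function name cap_parallel is an assumption, and the file names follow the orcl_%U.dmp template from the example above.

```shell
#!/bin/sh
# Choose a PARALLEL value that does not exceed the number of dump files.
# Arguments: requested degree, then the dump file names.
cap_parallel() {
    requested=$1
    shift
    nfiles=$#
    if [ "$nfiles" -lt 1 ]; then
        echo 1                      # no files listed: fall back to serial
    elif [ "$requested" -gt "$nfiles" ]; then
        echo "$nfiles"
    else
        echo "$requested"
    fi
}

# Four files were written by the export, so a request for 8 is capped at 4.
cap_parallel 8 orcl_01.dmp orcl_02.dmp orcl_03.dmp orcl_04.dmp
```

The result could then be passed on as parallel=N to an impdp call that uses dumpfile=orcl_%U.dmp.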

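To check the number of CPUs:

SQL> show parameter cpu

Notes:

(1) An import may appear to hang at some point, for example for ten minutes or more while creating an index. Do not interrupt it; there is probably just a lot of data to process.

Watch how tablespace usage changes over time. If it keeps changing, the import is still running, so just wait.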

The following SQL shows tablespace usage:

/* Formatted on 2010/12/23 13:14:13 (QP5 v5.115.810.9015) */

SELECT a.tablespace_name,
       ROUND (a.total_size) "total_size(MB)",
       ROUND (a.total_size) - ROUND (b.free_size, 3) "used_size(MB)",
       ROUND (b.free_size, 3) "free_size(MB)",
       ROUND (b.free_size / a.total_size * 100, 2) || '%' free_rate
  FROM (  SELECT tablespace_name, SUM (bytes) / 1024 / 1024 total_size
            FROM dba_data_files
        GROUP BY tablespace_name) a,
       (  SELECT tablespace_name, SUM (bytes) / 1024 / 1024 free_size
            FROM dba_free_space
        GROUP BY tablespace_name) b
 WHERE a.tablespace_name = b.tablespace_name(+);

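(2) Avoid running the export or import command interactively in an ssh session opened from a client: if the connection drops, the command is killed with it. Put the backup script in crontab so that it runs on the server itself; then a broken ssh session no longer affects the backup or restore. This matters most for exp/imp, which do all their work in the client process. A Data Pump job runs inside the database and can be re-attached with ATTACH, but the client's screen output is still lost.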
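A crontab entry for such a script could be generated as in the sketch below. The script path /home/oracle/bin/expdp_full.sh and the 01:30 schedule are hypothetical, and the helper only prints the line so that it can be reviewed before being added with crontab -e.

```shell
#!/bin/sh
# Print a crontab line that runs a backup script daily at the given time and
# appends its output to a log file next to the script.
cron_entry() {
    minute=$1
    hour=$2
    script=$3
    printf '%s %s * * * %s >> %s.log 2>&1\n' "$minute" "$hour" "$script" "$script"
}

cron_entry 30 1 /home/oracle/bin/expdp_full.sh

# For a one-off run, detaching from the terminal has a similar effect:
#   nohup /home/oracle/bin/expdp_full.sh > expdp_full.log 2>&1 &
```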
